Many web sites and advertisement placement services generate considerable revenue from the placement of advertisements. The revenue model for many web sites is a clickthrough model in which an advertiser pays for placement of the advertisement only when a user clicks on the advertisement. The advertiser and the web site provider both have incentives to ensure that advertisements are likely to be of interest to the user of the web page. If the advertisement is not of interest, then the user is unlikely to click on the advertisement. For example, if the web page relates to the locations of basketball courts provided by a city and the advertisement relates to buying flowers, the user interested in the location of basketball courts is unlikely to be interested in buying flowers. If the user does not click on the advertisement, the web site provider loses revenue it might have received if the advertisement had been of interest to the user. If the user does click on the advertisement, the advertiser will pay for the advertisement even though the advertiser is unlikely to generate revenue from that placement because the user is unlikely to purchase flowers.
To help ensure that advertisements are of interest to the user of a web page, advertisements are selected based on their relevance to the content of the web page. To help ensure that advertisements are related to the content of a web page, an advertiser may specify one or more target words for placing an advertisement. If a web page is related to a target word, then the advertisement may be assumed to be related to the content of the web page. For example, an advertiser who is advertising basketball shoes may specify target words of “basketball shoes,” “basketball court,” and “basketball.” The advertiser may be willing to pay more for the advertisement when it is placed on a web page that contains the target word “basketball shoes” than when it is placed on a web page that contains only one of the other two target words, because “basketball shoes” is more specific to the product being advertised.
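As an illustration only, target-word placement of the kind described above might be sketched as follows. The sketch assumes a simple case-insensitive substring match between a target word and the page text, and the names `Ad`, `bids`, and `select_ad`, as well as the bid amounts, are hypothetical and do not describe any particular advertisement placement service.

```python
from dataclasses import dataclass


@dataclass
class Ad:
    name: str
    # Hypothetical mapping: target word -> amount the advertiser pays per click.
    bids: dict


def select_ad(page_text, ads):
    """Return (bid, ad name, target word) for the highest-bid target word
    that appears in the page text, or None if no target word matches."""
    text = page_text.lower()
    best = None
    for ad in ads:
        for target_word, bid in ad.bids.items():
            # Assumption: a page "relates to" a target word if it contains it.
            if target_word.lower() in text and (best is None or bid > best[0]):
                best = (bid, ad.name, target_word)
    return best


ads = [
    Ad("basketball shoes ad", {"basketball shoes": 0.50,
                               "basketball court": 0.20,
                               "basketball": 0.10}),
    Ad("flowers ad", {"flowers": 0.30}),
]
placement = select_ad("Locations of city basketball court facilities", ads)
```

In this sketch, the more specific target word carries the higher bid, so a page that mentions “basketball shoes” would yield a higher-paying placement than a page that mentions only “basketball.”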
Advertisements are often placed on display pages (e.g., web pages) for online discussions such as instant messaging sessions, discussion threads, web logs (“blogs”), and so on. Advertisements that relate in some way to the topic of an online discussion are generally effective when placed with the online discussion. However, it probably would not be effective to place an advertisement for courtside tickets for a basketball game with an online discussion relating to analysis of opinions of the U.S. Supreme Court, even though the advertisement and the online discussion are both related in some way to the keyword “court.” In contrast, an advertisement relating to online access to briefs filed with the Supreme Court is related to the topic of the online discussion and is likely to be more effective than an advertisement for courtside tickets. The effectiveness of advertisements for online discussions depends in large part on the effectiveness of identifying the topics of the online discussions. Although several attempts have been made to identify the topics of online discussions, these attempts have not proved to be completely satisfactory.
Identifying the topics of online discussions is also useful in many applications other than the placement of advertisements. For example, if online discussions are categorized according to their topics, users can browse the categories to locate online discussions of interest. As another example, a search engine service for online discussions may receive a query and output an indication of the online discussions that match the query. The search engine service may rank a matching online discussion higher when the topics of the online discussion match the terms of the query. For example, if the query is “supreme court,” then a matching online discussion whose topics include “courts” would have its ranking increased. Another example of an application that uses the topics of an online discussion is the generation of discussion summaries. A summary of an online discussion may be generated by selecting the sentences that are most relevant to the topics of the online discussion. The relevance of a sentence may be based in part on whether the sentence contains a word relating to a topic of the discussion.
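As an illustration only, the summary generation described above might be sketched as follows. The sketch assumes that the topics of the discussion are already available as a list of words and that the relevance of a sentence is simply the number of distinct topic words it contains; the name `summarize` and its parameters are hypothetical.

```python
import re


def summarize(sentences, topic_words, max_sentences=2):
    """Select up to max_sentences sentences that contain the most topic
    words, returned in their original order in the discussion."""
    topics = {w.lower() for w in topic_words}

    def score(sentence):
        # Assumption: relevance = count of distinct topic words in the sentence.
        words = set(re.findall(r"[a-z']+", sentence.lower()))
        return len(words & topics)

    # Stable sort: sentences with equal scores keep their original order.
    ranked = sorted(range(len(sentences)), key=lambda i: -score(sentences[i]))
    # Drop sentences with no topic words, then restore discussion order.
    chosen = sorted(i for i in ranked[:max_sentences] if score(sentences[i]) > 0)
    return [sentences[i] for i in chosen]


discussion = [
    "The court issued its opinion today.",
    "I had pizza for lunch.",
    "The justices of the supreme court were divided.",
    "See you later.",
]
summary = summarize(discussion, ["court", "supreme", "opinion"])
```

A fuller approach would likely weight topic words and account for word variants such as “court” and “courts”; this sketch matches exact words only.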